Dropping Null Values
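
As a minimal sketch of this step, assuming the data lives in a pandas DataFrame (the sample rows and the `clean` variable below are made up for illustration), missing rows can be counted and then dropped:

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; the real dataset is not shown in this section.
df = pd.DataFrame({
    "Airline": ["IndiGo", "Air India", "Jet Airways"],
    "Route": ["BLR-DEL", None, "DEL-COK"],
    "Total_Stops": ["non-stop", "1 stop", np.nan],
})

# Count missing values per column before dropping anything.
print(df.isnull().sum())

# Drop every row that has at least one missing value and renumber the index.
clean = df.dropna().reset_index(drop=True)
```

Dropping rows is reasonable when only a handful are affected; with many missing values, imputation may be the better choice.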

Define a function that converts a column to datetime

Convert the date and time columns to datetime with that function

Separate the day and month of Date_of_Journey

Drop the Date_of_Journey column as it is not necessary anymore
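
The date steps above could look roughly like this. The `dd/mm/YYYY` format and the `Journey_day` / `Journey_month` column names are assumptions, since the section does not show the data:

```python
import pandas as pd

# Hypothetical sample; the day/month/year format is an assumption.
df = pd.DataFrame({"Date_of_Journey": ["24/03/2019", "01/05/2019"]})

# Parse the strings into datetime values.
journey = pd.to_datetime(df["Date_of_Journey"], format="%d/%m/%Y")

# Keep the day and month as separate numeric features.
df["Journey_day"] = journey.dt.day
df["Journey_month"] = journey.dt.month

# The original string column is no longer needed.
df = df.drop(columns=["Date_of_Journey"])
```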

Define functions to extract the hour and minute from Dep_Time, then drop the original column

Extract the hour and minute from Arrival_Time, then drop the original column
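
A possible helper for both time columns is sketched below. The function name `extract_hour_min`, the `_hour` / `_minute` suffixes, and the sample values (including an arrival time with a trailing date) are assumptions for illustration:

```python
import pandas as pd

def extract_hour_min(frame, col):
    """Return a copy with <col>_hour and <col>_minute added and <col> dropped."""
    # Keep only the leading HH:MM token, in case a date follows the time.
    times = frame[col].str.split().str[0]
    parsed = pd.to_datetime(times, format="%H:%M")
    out = frame.copy()
    out[col + "_hour"] = parsed.dt.hour
    out[col + "_minute"] = parsed.dt.minute
    return out.drop(columns=[col])

# Hypothetical sample rows.
df = pd.DataFrame({
    "Dep_Time": ["22:20", "05:50"],
    "Arrival_Time": ["01:10 22 Mar", "13:15"],
})

df = extract_hour_min(df, "Dep_Time")
df = extract_hour_min(df, "Arrival_Time")
```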

Split the duration

Add 0h or 0m to durations that are missing the hour or minute part

Define hour and minute functions to split the duration into hours and minutes

Drop the Duration column as it is not necessary anymore
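
The duration steps above can be sketched as follows, assuming values such as `2h 50m`, `19h`, or `45m`. The helper name `normalize_duration` is hypothetical:

```python
import pandas as pd

def normalize_duration(text):
    """Pad a duration so it always has both an hour and a minute part."""
    text = text.strip()
    if "h" not in text:
        text = "0h " + text   # e.g. "45m" -> "0h 45m"
    if "m" not in text:
        text = text + " 0m"   # e.g. "19h" -> "19h 0m"
    return text

# Hypothetical sample rows.
df = pd.DataFrame({"Duration": ["2h 50m", "19h", "45m"]})

padded = df["Duration"].apply(normalize_duration)
parts = padded.str.split(" ", expand=True)

# Strip the unit suffix and cast to integers.
df["Duration_hours"] = parts[0].str.rstrip("h").astype(int)
df["Duration_mins"] = parts[1].str.rstrip("m").astype(int)

df = df.drop(columns=["Duration"])
```

Casting inside the same step avoids a separate data type fix later on.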

Check the data types

Change the data type of Duration_hours and Duration_mins to integers

Separate the categorical and continuous data

Count the number of flights per airline in the data

Create a boxplot of price by airline

Create a boxplot of price by total stops

Create dummy variables for the Airline categorical data
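
One way to build the dummy variables is `pandas.get_dummies`. In this sketch the sample airlines are made up, and dropping the first category (`drop_first=True`) is an assumption, not something the section states:

```python
import pandas as pd

# Hypothetical sample of the categorical data.
categorical = pd.DataFrame(
    {"Airline": ["IndiGo", "Air India", "Jet Airways", "IndiGo"]}
)

# One indicator column per airline; the first category (alphabetically)
# is dropped so the remaining columns are not perfectly collinear.
airline_dummies = pd.get_dummies(
    categorical["Airline"], drop_first=True, dtype=int
)
print(airline_dummies)
```

The same call applies unchanged to other nominal columns such as Source and Destination.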

Count the number of flights per Source

Create a catplot of price by source

Create dummy variables for the Source categorical data

Create dummy variables for the Destination categorical data

Split the Route categorical data

Fill the NA values in the route columns

Import LabelEncoder from the sklearn.preprocessing module

Apply the label encoder to the categorical data
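
The route steps above (split, fill, encode) might be combined as in this sketch. The arrow separator, the `Route_1` ... `Route_n` column names, and the `"None"` filler string are assumptions:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

SEP = "→"  # assumed separator between airports in Route

# Hypothetical sample rows.
df = pd.DataFrame(
    {"Route": ["BLR → DEL", "CCU → IXR → BBI → BLR", "DEL → COK"]}
)

# One column per leg; shorter routes leave missing values on the right.
routes = df["Route"].str.split(SEP, expand=True)
routes = routes.apply(lambda col: col.str.strip()).fillna("None")
routes.columns = [f"Route_{i + 1}" for i in range(routes.shape[1])]

# Encode each route column as integers with its own LabelEncoder.
encoded = routes.copy()
for name in encoded.columns:
    encoded[name] = LabelEncoder().fit_transform(encoded[name])
```

`LabelEncoder` assigns integers in sorted label order, so the numbers carry no meaning beyond identity. Tree-based models tolerate this; linear models usually do better with dummy variables.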

Drop the Route and Additional_Info columns as they are not necessary anymore

Count the Total_Stops values

Get the unique Total_Stops values

Create a dictionary that maps each Total_Stops value to a number
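
Because the number of stops is ordinal, a plain dictionary keeps the order intact. The labels below are assumed; in practice they should match the unique Total_Stops values found in the data:

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({"Total_Stops": ["non-stop", "1 stop", "2 stops"]})

# Assumed labels, mapped to the number of stops they describe.
stops_map = {
    "non-stop": 0,
    "1 stop": 1,
    "2 stops": 2,
    "3 stops": 3,
    "4 stops": 4,
}

df["Total_Stops"] = df["Total_Stops"].map(stops_map)
```

Any label missing from the dictionary becomes NaN after `map`, which makes unexpected values easy to spot.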

Concatenate the categorical and continuous data

Dealing with Outliers
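
The section does not say how outliers are handled. One common option, shown here purely as an assumption, is to flag prices above the upper IQR fence and replace them with the median:

```python
import pandas as pd

# Hypothetical prices with one extreme value.
prices = pd.Series([3897, 7662, 13882, 6218, 13302, 79512], name="Price")

# Upper fence: third quartile plus 1.5 times the interquartile range.
q1, q3 = prices.quantile([0.25, 0.75])
upper = q3 + 1.5 * (q3 - q1)

# Keep values at or below the fence; replace the rest with the median.
capped = prices.where(prices <= upper, prices.median())
```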

Separating Independent and Dependent Data

Feature Selection

Why apply feature selection?

To select the important features and mitigate the curse of dimensionality, i.e. to get rid of redundant features.

I wanted to compute the mutual information scores (or matrix) to understand the relationships between the features.

Feature selection using Information Gain

List the variables from most to least important
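
A sketch of ranking features by mutual information with scikit-learn's `mutual_info_regression` follows. The synthetic data is an assumption, built so that only `Total_Stops` drives the target:

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_regression

# Synthetic stand-in for the prepared features and the price target.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "Total_Stops": rng.integers(0, 4, 300),
    "Journey_day": rng.integers(1, 31, 300),
})
y = 4000 * X["Total_Stops"] + rng.normal(0, 500, 300)

# Estimate the mutual information between each feature and the target.
scores = mutual_info_regression(X, y, random_state=0)

# Sort from most to least informative.
ranking = pd.Series(scores, index=X.columns).sort_values(ascending=False)
print(ranking)
```

Scores are non-negative and unbounded above; only their relative order matters for selection.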

Split the data into train and test sets with sklearn
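
Separating the target and splitting could look like this. The synthetic data, the 80/20 ratio, and the seed are assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared dataset.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "Total_Stops": rng.integers(0, 4, 100),
    "Duration_hours": rng.integers(1, 30, 100),
    "Price": rng.integers(3000, 20000, 100),
})

# Independent features (X) and dependent target (y).
X = data.drop(columns=["Price"])
y = data["Price"]

# Hold out 20% of the rows for testing; the seed makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```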

Dump the model using pickle
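
As a rough illustration of saving and reloading a fitted model with `pickle`, the sketch below trains a small random forest on synthetic data. The file name `flight_price_rf.pkl` is hypothetical:

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data and a small model, just to have something to save.
rng = np.random.default_rng(0)
X = rng.random((50, 3))
y = X @ np.array([1.0, 2.0, 3.0])
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "flight_price_rf.pkl"  # hypothetical file name
    # Serialize the fitted model to disk.
    with open(path, "wb") as fh:
        pickle.dump(model, fh)
    # Load it back, as a later session would.
    with open(path, "rb") as fh:
        restored = pickle.load(fh)

# The restored model should reproduce the original predictions.
same = np.allclose(model.predict(X), restored.predict(X))
```

Pickle files should only be loaded from trusted sources, and preferably with the same scikit-learn version that created them.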

Import the RandomForestRegressor class

Experiment with multiple algorithms

reg_rf = RandomForestRegressor()

Hyperparameter Tuning

Assigning Hyperparameters
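
The section does not list the hyperparameters or the search strategy. A minimal sketch, assuming a randomized search over a small, made-up grid for `RandomForestRegressor`:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic regression data standing in for the training set.
rng = np.random.default_rng(0)
X = rng.random((120, 4))
y = 5000 * X[:, 0] + 1000 * X[:, 1] + rng.normal(0, 50, 120)

# Assumed search space; the values are illustrative only.
param_distributions = {
    "n_estimators": [20, 50, 100],
    "max_depth": [3, 5, 10, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 5],
}

# Sample 5 random combinations and score each with 3-fold cross-validation.
search = RandomizedSearchCV(
    estimator=RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    scoring="neg_mean_squared_error",
    random_state=42,
)
search.fit(X, y)
print(search.best_params_)
```

A randomized search tries a fixed number of combinations, which keeps the cost predictable as the grid grows. `GridSearchCV` is the exhaustive alternative.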